Data analysis and visualisation

Model preparation

Before applying any algorithm, we should check whether the classes are evenly represented in the data. If they are not, the model will suffer from a class imbalance problem.
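A minimal sketch of such a balance check, using only the Python standard library. The helper name `class_proportions` and the toy labels are assumptions for illustration; they are not taken from the project's code or data.

```python
from collections import Counter


def class_proportions(labels):
    """Return the share of samples belonging to each class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Made-up labels for illustration only: 0 = Dementia, 1 = MCI
toy_labels = [0, 0, 0, 1, 1, 1, 1, 1]
proportions = class_proportions(toy_labels)

for label, share in sorted(proportions.items()):
    print(f"class {label}: {share:.1%}")
```

If one class holds a much larger share than the other, resampling or class weighting may be worth considering before training.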

Machine learning models

Naive Bayes, Random Forest Classifier, K-Nearest Neighbours, Decision Tree, Support Vector Machine, Multinomial Logistic Regression, Extreme Gradient Boosting

Trained on the dataset, the Naive Bayes model achieved an accuracy of 78%, which is higher than the baseline accuracy. For class "0" (Dementia), it achieved a recall of 1 and an F1 score of 0.73; for class "1" (MCI), it achieved a recall of 0.6 and an F1 score of 0.67.
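For readers unfamiliar with the per-class metrics quoted here, the sketch below shows how precision, recall and F1 are computed for one class from true and predicted labels. The function name `per_class_metrics` and the labels are made up for illustration; they are not the project's predictions and do not reproduce its scores.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Compute precision, recall and F1, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for illustration only: 0 = Dementia, 1 = MCI
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1]

metrics_class_0 = per_class_metrics(y_true, y_pred, positive=0)
metrics_class_1 = per_class_metrics(y_true, y_pred, positive=1)
print("class 0 (precision, recall, F1):", metrics_class_0)
print("class 1 (precision, recall, F1):", metrics_class_1)
```

In this toy case every class 0 sample is found (recall of 1), but two class 1 samples are mislabelled as class 0, which lowers the precision for class 0 and the recall for class 1.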

The Random Forest model attained an accuracy of 89%, which is higher than the Naive Bayes model. Moreover, its precision for identifying Dementia is 1, with an F1 score of 0.78.
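Recall is not reported for the Random Forest model, but it can be estimated from the reported precision and F1 score, because F1 = 2PR / (P + R) can be solved for R. The sketch below does that arithmetic; the helper name `recall_from_precision_f1` is hypothetical, and since the reported figures are rounded, the result is only approximate.

```python
def recall_from_precision_f1(precision, f1):
    """Invert F1 = 2PR / (P + R) to solve for recall R."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for this precision/F1 pair")
    return f1 * precision / denominator


# Reported values for the Dementia class: precision 1, F1 score 0.78
implied_recall = recall_from_precision_f1(precision=1.0, f1=0.78)
print(round(implied_recall, 2))  # 0.64
```

A recall of roughly 0.64 alongside a precision of 1 would mean that every sample predicted as Dementia was correct, while about a third of the true Dementia samples were missed.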

The KNN model achieved an accuracy of 83%, which is slightly lower than the Random Forest model. It has a precision of 1 and an F1 score of 0.78 for class "0" (Dementia).

The Decision Tree model achieved an accuracy of 83%, which is slightly lower than the Random Forest model.

The SVM model achieved an accuracy of 86%, which is lower than the Random Forest model.

Model Evaluation

Conclusion

Random Forest gives the best accuracy (89%) of the models compared.